Discovering Relations Among GO-Annotated Clusters by Graph Kernel Methods
Abstract
The biological interpretation of large-scale gene expression data is one of the challenges in current bioinformatics. The state-of-the-art approach is to cluster the data and then characterize each cluster functionally through enrichment in Gene Ontology (GO) terms [1]. To better assist the interpretation of results, it may be useful to establish connections among different clusters. This machine learning step is sometimes termed cluster meta-analysis, and several approaches have already been proposed; they usually rely on enrichments based on flat lists of GO terms. However, GO terms are organized in taxonomical graphs, whose structure should be taken into account when performing enrichment studies. To tackle this problem, we propose a kernel approach that can exploit this graph structure. Finally, we compare our approach against a specific flat-list method by analyzing the cdc15 subset of the well-known Spellman's Yeast Cell Cycle dataset [2].
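As a rough illustration of why the graph structure matters, the sketch below contrasts a flat-list overlap between the GO terms of two clusters with a similarity computed after each term set is extended with its ancestors. This is not the kernel proposed in the paper, which the abstract does not specify; the toy DAG, the term names, and the functions `ancestor_closure`, `flat_kernel`, and `closure_kernel` are all assumptions made for illustration. Counting shared elements of two sets is an inner product of indicator vectors, so both functions are valid kernels.

```python
# Toy GO-like DAG: each term maps to its set of parents.
# Term names are hypothetical and chosen only for illustration.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "cell_cycle": {"root"},
    "mitosis": {"cell_cycle"},
    "dna_replication": {"cell_cycle", "metabolism"},
}


def ancestor_closure(terms):
    """Return the given terms together with all their ancestors in the DAG."""
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS[term])
    return closed


def flat_kernel(a, b):
    """Flat-list similarity: number of terms the two sets share directly."""
    return len(set(a) & set(b))


def closure_kernel(a, b):
    """Structure-aware similarity: shared terms after adding ancestors."""
    return len(ancestor_closure(a) & ancestor_closure(b))


# Two clusters annotated with different, but related, terms.
cluster_1 = {"mitosis"}
cluster_2 = {"dna_replication"}

flat_score = flat_kernel(cluster_1, cluster_2)      # no term in common
graph_score = closure_kernel(cluster_1, cluster_2)  # shared ancestors count
print(flat_score, graph_score)
```

In this toy case the flat comparison sees no relation between the two clusters, while the ancestor-extended comparison picks up their common parent terms.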
Related resources
Automatic Multimedia Knowledge Discovery, Summarization and Evaluation
This paper presents novel methods for automatically discovering, summarizing and evaluating multimedia knowledge from annotated images in the form of image clusters, word senses and relationships among them, among others. These are essential for applications to intelligently, efficiently and coherently deal with multimedia. The proposed methods include automatic techniques (1) for constructing...
Learning the Graph of Relations Among Multiple Tasks
We propose multitask Laplacian learning, a new method for jointly learning clusters of closely related tasks. Unlike standard multitask methodologies, the graph of relations among the tasks is not assumed to be known a priori, but is learned by the multitask Laplacian algorithm. The algorithm builds on kernel based methods and exploits an optimization approach for learning a continuously parame...
Discovering Relations among Named Entities from Large Corpora
Discovering the significant relations embedded in documents would be very useful not only for information retrieval but also for question answering and summarization. Prior methods for relation discovery, however, needed large annotated corpora which cost a great deal of time and effort. We propose an unsupervised method for relation discovery from large corpora. The key idea is clustering pair...
A graph-theoretic modeling on GO space for biological interpretation of gene clusters
MOTIVATION: With the advent of DNA microarray technologies, the parallel quantification of genome-wide transcription has been a great opportunity to systematically understand complicated biological phenomena. Amidst the enthusiastic investigations into the intricate gene expression data, clustering methods have been useful tools to uncover the meaningful patterns hidden in those data. T...
RECOME: A new density-based clustering algorithm using relative KNN kernel density
Discovering clusters from a dataset with different shapes, density, and scales is a known challenging problem in data clustering. In this paper, we propose the RElative COre MErge (RECOME) clustering algorithm. The core of RECOME is a novel density measure, i.e., Relative K nearest Neighbor Kernel Density (RNKD). RECOME identifies core objects with unit RNKD, and partitions non-core objects int...